An empirical criterion is discussed that may be used to assess the significance 
of a regression model term of fitted strain gage outputs of a wind tunnel balance. 
The criterion is based on the percent contribution of a regression model term. It 
considers a term insignificant if the percent contribution is below the threshold 
of 0.05 %. The criterion has the advantage that it can easily be computed using 
the regression coefficients of the strain gage outputs and the load capacities of 
the balance. First, a detailed definition of the empirical criterion is provided. 
Then, the empirical criterion is compared with a more rigorous criterion that is 
traditionally used in linear regression analysis for the assessment of the statistical 
significance of a regression model term. Finally, calibration data from a variety 
of balances is used to illustrate the connection of the empirical criterion to real 
world data sets. A preliminary review of these results indicates that the percent 
contribution threshold of 0.05 % may still be too large for the assessment of 
the significance of terms of some balance calibration data sets. Therefore, it is 
recommended to apply the rigorous criterion whenever regression model term 
reduction for the prevention of over-fitting needs to be performed. 


Nomenclature 

axial force component of force balance 

coefficient of the regression model of a gage output, defined in Ref. [1], Eq. (3.1.3) 

coefficient of the regression model of a gage output, defined in Ref. [1], Eq. (3.1.3) 

coefficient of the regression model of a gage output, defined in Ref. [1], Eq. (3.1.3) 

index of a gage output -or- index of a primary gage load 

index of a load component 
index of a load component 
capacity of a primary gage load of a balance 
number of load components of a balance 
forward normal force component of force balance 
aft normal force component of force balance 

percent contribution of a coefficient of the regression model of a gage output 
strain-gage outputs of force balance 
rolling moment component of force balance 
forward side force component of force balance 
aft side force component of force balance 

Summary 

Different approaches are used in the wind tunnel testing community to perform a regression analysis of 
wind tunnel strain-gage balance calibration data. Many analysts prefer the Iterative Method as this approach 
fits the strain-gage outputs as a function of the calibration loads (see Ref. [1] for a detailed description of 
the method). 

The regression analysis of the strain-gage outputs is a multivariate problem. Figure 1 shows, for 
example, a typical term selection that an analyst may make for the analysis of a balance calibration data 
set of a force balance. The term selection takes into account that combined loadings were only applied to 
the normal and side force components. 

Different function classes like linear terms, absolute value terms, square terms, and cross-product terms 
may be used to assemble a suitable regression model. Ultimately, a set of regression model terms has to be 
selected such that (i) the behavior of the balance can be characterized correctly, (ii) near-linear dependencies 
between terms are suppressed, and (iii) the balance calibration data is not overfitted. Therefore, analysts 
use a host of different empirical and rigorous criteria to select individual terms of the regression model of 
the gage outputs. 

Sometimes, an analyst may use an empirical criterion for the term selection (or reduction) that has 
originated in the wind tunnel testing community. It is based on the percent contribution of a regression 
coefficient. This empirical criterion can be summarized as follows: 

Empirical Criterion for Regression Model Term Significance 

A term of the regression model of a strain-gage output is considered “significant” 
if the absolute value of the percent contribution of the term is greater than 0.05%. 


The percent contribution is a metric that can easily be computed for a given balance calibration data 
set using corresponding coefficients of the regression model of the data (see App. 1 for more detail). It is 
only a function of (i) the load capacities of the balance and (ii) the regression coefficient values of the least 
squares fit of the balance calibration data. Figure 2 shows, for example, the percent contribution if the terms 
selected in Fig. 1 are used for the analysis of the calibration data of the Ames MK40 balance. 

The origin and justification of the empirical criterion defined above was not obvious to the authors. 
Therefore, they decided to study the validity of the empirical criterion by using a more rigorous criterion 
from linear regression analysis that is used in statistics to assess the significance of regression model terms. 
The application of the rigorous criterion is, of course, more complex than the application of the empirical 
criterion. Therefore, only basic ideas behind the rigorous criterion are presented. 

In principle, the rigorous criterion assesses the significance of terms of the regression model by looking at 
the standard error of each regression coefficient. The standard error is an estimate of the standard deviation 
of the coefficient. It can be thought of as a measure of the precision with which the regression coefficient is 
measured. A coefficient should be included in the math model if it is large compared to its standard error. 

The t-statistic of a coefficient is used to quantitatively compare a regression coefficient with its standard 
error. It equals the ratio between the coefficient value and its standard error. A coefficient is probably 
“significant” if the absolute value of its t-statistic is greater than the critical value of a Student’s 
t-distribution (see Ref. [2], p. 32). This comparison can also be performed by using the p-value of the 
coefficient. The p-value of a coefficient is determined from a comparison of the t-statistic with values in a 
Student’s t-distribution. With a p-value of, e.g., 0.0001 (or 0.01 %) one can say with a 99.99 % probability 
of being correct that the regression coefficient is having some effect. Finally, the rigorous criterion for 
testing the significance of a regression model term can be summarized as follows: 

Rigorous Criterion for Regression Model Term Significance 

A term of the regression model of a strain-gage output is considered “significant” 
if the p-value of the t-statistic of the regression coefficient is less than 0.0001. 


A connection between the rigorous and the empirical criterion needs to be established so that the rigorous 
criterion may be used to evaluate the empirical criterion. A connection can be defined if, for example, the 
smallest percent contribution is found for a given regression model of strain-gage outputs that still satisfies 
the rigorous criterion, i.e., p-value < 0.0001. Terms with a percent contribution below this value no longer 
satisfy the rigorous criterion and are, consequently, considered statistically insignificant. 
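
The search for this boundary value can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the example (percent contribution, p-value) pairs are hypothetical and are not taken from any of the investigated calibration data sets.

```python
P_VALUE_THRESHOLD = 0.0001  # threshold of the rigorous criterion


def smallest_significant_contribution(terms):
    """Return the smallest |percent contribution| among the terms whose
    p-value satisfies the rigorous criterion, or None if no term does.

    `terms` is a list of (percent_contribution, p_value) pairs.
    """
    significant = [abs(pc) for pc, p in terms if p < P_VALUE_THRESHOLD]
    return min(significant) if significant else None


# Hypothetical (percent contribution [%], p-value) pairs of one gage output.
example_terms = [
    (100.00, 1e-12),
    (-3.10, 1e-9),
    (0.30, 5e-6),
    (0.07, 8e-5),
    (0.06, 0.0841),
    (0.02, 0.35),
]

boundary = smallest_significant_contribution(example_terms)
print(boundary)
```

For the hypothetical values above, the boundary lies at 0.07 %, i.e., above the empirical threshold of 0.05 %.
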
Balance calibration data sets of different balance types were used to study the connection between the 
rigorous and empirical criterion. Table 1 below shows some results that were obtained so far. 


Table 1: Strain-gage balance data analysis examples. 

BALANCE   BALANCE        NUMBER OF     REGRESSION MODEL    SMALLEST PERCENT CONTRIBUTION 
NAME      TYPE           DATA POINTS   TERM COMBINATION    THAT MET RIGOROUS CRITERION 
                                                           (p-value of t-statistic < 0.0001) 

NTF-A     SINGLE-PIECE   410           F, F², F·G          0.04 % 
MK-40     MULTI-PIECE    164           F, |F|, F², F·G     0.07 % 
MC-60E    HI-CAP         1906          F, |F|, F², F·G     0.03 % 
MC-110    SEMISPAN       1133          F, F², F·G          0.02 % 
MC-400    SEMISPAN       498           F, F², F·G          0.08 % 


Overall, it appears that the threshold of the empirical criterion (0.05 %) is close to the arithmetic mean 
of the smallest percent contributions that still satisfied the rigorous criterion for the investigated data 
sets. However, it also appears that smaller values of the percent contribution threshold should be chosen for 
some balance calibration data sets. Therefore, the authors recommend that the rigorous criterion instead of 
the empirical criterion should be applied whenever regression model term reduction for the prevention of 
over-fitting of balance calibration data needs to be performed. 
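
The comparison with the arithmetic mean can be checked with a short calculation. The sketch below uses only the five values listed in Table 1; the variable names are chosen here for illustration.

```python
# Smallest percent contribution [%] that still met the rigorous criterion,
# copied from Table 1.
smallest_pc = {
    "NTF-A": 0.04,
    "MK-40": 0.07,
    "MC-60E": 0.03,
    "MC-110": 0.02,
    "MC-400": 0.08,
}
EMPIRICAL_THRESHOLD = 0.05  # percent

# Arithmetic mean of the five boundary values.
mean_pc = sum(smallest_pc.values()) / len(smallest_pc)

# Balances whose boundary value lies below the empirical threshold.
below = [name for name, pc in smallest_pc.items() if pc < EMPIRICAL_THRESHOLD]

print(f"mean = {mean_pc:.3f} %")
print("below threshold:", below)
```

The mean is about 0.048 %, while three of the five data sets (NTF-A, MC-60E, MC-110) have a boundary value below 0.05 %. For those data sets the empirical threshold would discard terms that the rigorous criterion still considers significant.
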

A wider variety of balance calibration data sets will be discussed in the final manuscript of the proposed 
conference paper. In addition, the influence of the regression model term type selection on the percent 
contribution threshold will be investigated in more detail. 
Appendix: Definition of the Percent Contribution 


The percent contribution describes the contribution of each term of the regression model of the gage 
outputs to the total fitted value, expressed as a percentage of the contribution of the principal diagonal term. 
It is used to assess the degree of linearity of the regression model of the gage outputs. 

The percent contribution is defined as the ratio of two numerical values. This ratio is expressed as a 
percentage. The percent contribution only depends on (i) the known capacities of the load components of 
the balance and (ii) the regression coefficients that are the result of the regression analysis of the balance 
calibration data. For convenience, the regression coefficient nomenclature introduced in Ref. [1], Eq. (3.1.3), 
may be used to illustrate the calculation of the percent contribution. Then, the first numerical value can be 
expressed as the product of the regression coefficient of the primary linear term “b1(i,i)” of the regression 
model of the gage output with the capacity “K(i)” of the related balance load component. We get: 

$$ Q(i) = b1(i,i) \cdot K(i) \tag{1} $$

Equation (1) defines the reference value that is used to investigate the linearity of the regression model 
of the gage output. It is the denominator of the ratio that defines the percent contribution. The numerator 
of the ratio, on the other hand, is the product of the investigated regression coefficient with the related 
regressor variable value that is obtained by using the load capacities as applied loads. Now, the percent 
contribution of the ten math term type groups defined in Ref. [1], Eq. (3.1.3), can be summarized as follows: 
where j = 1, ..., n and k = j + 1, ..., n. The denominator of the percent contribution, i.e., Q(i), depends on 
the gage output index i. It changes whenever the index of the gage output changes during the calculation 
of the percent contribution. 
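
The ratio described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of Ref. [1]: the function name, the coefficient values, and the capacities are hypothetical, and only a linear term and a cross-product term are shown.

```python
def percent_contribution(coefficient, regressor_at_capacity, b1_ii, k_i):
    """Percent contribution of one regression model term of gage output i.

    coefficient           -- regression coefficient of the investigated term
    regressor_at_capacity -- regressor value obtained by using the load
                             capacities as applied loads
    b1_ii                 -- coefficient b1(i,i) of the primary linear term
    k_i                   -- capacity K(i) of the primary gage load
    """
    q_i = b1_ii * k_i  # reference value Q(i) of Eq. (1), the denominator
    return 100.0 * coefficient * regressor_at_capacity / q_i


# Hypothetical capacities of two load components and hypothetical coefficients.
k1, k2 = 2100.0, 2100.0
b1_11 = 1.0e-3        # primary linear term of gage output 1
b1_12 = 3.1e-5        # linear term of load 2 in gage output 1
b_cross_12 = 1.0e-9   # cross-product term (load 1 times load 2)

pc_primary = percent_contribution(b1_11, k1, b1_11, k1)
pc_linear = percent_contribution(b1_12, k2, b1_11, k1)
pc_cross = percent_contribution(b_cross_12, k1 * k2, b1_11, k1)

print(pc_primary, pc_linear, pc_cross)
```

By construction, the primary linear term always has a percent contribution of 100 %. All other terms are measured relative to it.
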
Fig. 1 Regression model term choice for the analysis of the MK40 balance calibration data. 
Fig. 2 Percent contribution of regression model terms of the strain-gage outputs. 
Evaluation of Regression Models of Balance 
Calibration Data using an Empirical Criterion 


N. Ulbrich* and T. Volden** 

Jacobs Technology Inc., Moffett Field, California 94035-1000 

An empirical criterion for assessing the significance of individual terms of 
regression models of wind tunnel strain gage balance outputs is evaluated. The 
criterion is based on the percent contribution of a regression model term. It 
considers a term to be significant if its percent contribution exceeds the empirical 
threshold of 0.05 %. The criterion has the advantage that it can easily be 
computed using the regression coefficients of the gage outputs and the load 
capacities of the balance. First, a definition of the empirical criterion is provided. 
Then, it is compared with an alternate statistical criterion that is widely used 
in regression analysis. Finally, calibration data sets from a variety of balances 
are used to illustrate the connection between the empirical and the statistical 
criterion. A review of these results indicated that the empirical criterion seems 
to be suitable for a crude assessment of the significance of a regression model 
term as the boundary between a significant and an insignificant term cannot be 
defined very well. Therefore, regression model term reduction should only be 
performed by using the more universally applicable statistical criterion. 
Nomenclature 

= axial force component of a force balance 

= coefficients of the regression model of a gage output, defined in Ref. [1], Eq. (3.1.3) 
= capacity of a primary gage load of a balance 

= coefficients of the regression model of a gage output, defined in Ref. [1], Eq. (3.1.3) 
= coefficients of the regression model of a gage output, defined in Ref. [1], Eq. (3.1.3) 
= generic strain-gage balance load symbols 
= index of a gage output -or- index of a primary gage load 
= index of a load component 
= number of load components of a balance 
= forward normal force component of a force balance 
= aft normal force component of a force balance 

= percent contribution of a coefficient of the regression model of a gage output 
= strain-gage outputs of a force balance 
= rolling moment component of a force balance 
= forward side force component of a force balance 
= aft side force component of a force balance 

= regression coefficients of the output of the forward normal force gage 
= regression coefficients of the output of the axial force gage 
= indicator variable for bi-directionality of a primary gage output 

I. Introduction 


Different approaches are used in the wind tunnel testing community to perform the regression analysis 


* Aerodynamicist, Jacobs Technology Inc. 

** Computer Engineer, Jacobs Technology Inc. 
of wind tunnel strain-gage balance calibration data. Many analysts prefer the Iterative Method as this 
approach fits strain-gage outputs as a function of calibration loads (see Ref. [1] for a detailed description 
of the method). The solution of the corresponding global regression analysis problem defines coefficients of 
a multivariate math model for each gage output as the individual outputs of a typical wind tunnel balance 
respond to more than one calibration load component. 

Function classes like linear terms, absolute value terms, square terms, and cross-product terms may 
be used to assemble a suitable regression model of the strain-gage outputs. Ultimately, a set of regression 
model terms has to be selected such that (i) the behavior of the strain-gage outputs of the balance can 
be characterized correctly, (ii) near-linear dependencies between terms are avoided, and (iii) the balance 
calibration data is not overfitted. Analysts use a host of different criteria to both evaluate and select 
individual terms of the regression model of the gage outputs so that the final regression model meets the 
chosen quality requirements. 

Traditionally, an empirical criterion is applied in the wind tunnel testing community to assess the 
significance of individual terms of the regression model of the outputs whenever the Iterative Method is chosen 
for the analysis of balance calibration data. This criterion is based on the so-called percent contribution of 
a regression model term (see the appendix for a definition of the percent contribution). Some analysts prefer 
to apply a statistical criterion in order to assess the importance of individual terms of a regression model. 
This alternate criterion is widely used in linear regression analysis. 

At this point a question may come up: how does the final term selection obtained from the application 
of the empirical criterion compare with the term selection that would result from the application of the 
statistical criterion? The present paper tries to find an answer to this question by applying both term 
selection criteria to a variety of balance calibration data sets. In addition, the selection of the terms 
resulting from the application of the two criteria is evaluated by using some knowledge about the physical 
characteristics of a specific balance design. 

In the next section of the paper the empirical and the statistical criterion are defined. Then, data from 
the calibration of NASA’s MK29B balance is used to compare subsets of recommended terms that were 
obtained after the application of the two criteria with the term selection that results from knowledge about 
the design of the balance. Finally, results from the analysis of 12 different balances are presented that assess 
the validity of the percent contribution threshold that the empirical criterion uses. 

II. Definition of Empirical and Statistical Criterion 

Basic definitions of the empirical and statistical criterion are reviewed in this section so that a meaningful 
comparison of the results of the application of the criteria can be made. In principle, both criteria may 
independently be used to eliminate insignificant terms in the regression model of the outputs of a 
strain-gage balance so that over-fitting of balance calibration data is prevented. 

First, the empirical criterion for the assessment of regression model terms is discussed. It is based on the 
percent contribution of a regression coefficient (see the appendix of this paper for a definition of the percent 
contribution). The criterion compares the absolute value of the percent contribution of an individual term 
of a regression model with an empirical threshold in order to decide whether or not the term is significant 
and should be retained in the model. The empirical criterion can be summarized as follows: 

Empirical Criterion for Term Significance 

A term of the regression model of a strain-gage output is considered “significant” 
if the absolute value of the percent contribution of the term is greater than 0.05%. 


The use of the percent contribution for the assessment of the significance of regression model terms has 
an advantage: it is a metric that can easily be computed for a given regression model term. It is only a 
function of (i) the load capacities of the balance and (ii) the regression coefficient values of the least squares 
fit of the strain-gage outputs. 

The application of the statistical criterion, on the other hand, is more complex than the application of 
the empirical criterion. The statistical criterion assesses the significance of terms of a regression model by 
looking at the standard error of each regression coefficient. The standard error is an estimate of the standard 
deviation of the coefficient. It can be thought of as a measure of the precision with which the regression 
coefficient is measured. A coefficient should be included in the math model if it is large compared to its 
standard error. 

The t-statistic of a coefficient is used to quantitatively compare a regression coefficient with its standard 
error. It equals the ratio between the coefficient value and its standard error. A coefficient is probably 
“significant” if its t-statistic is greater than the critical value of the Student’s t-distribution (see Ref. [2], 
p.32). This comparison can also be performed by using the p-value of the coefficient. The p-value of a 
coefficient is determined from a comparison of the t-statistic with values in the Student’s t-distribution. 
With a p-value of, e.g., 0.0001 (or 0.01 %) one can say with a 99.99 % probability of being correct that the 
regression coefficient is having some effect. The statistical criterion for testing the significance of a regression 
model term can be summarized as follows: 

Statistical Criterion for Term Significance 

A term of the regression model of a strain-gage output is considered “significant” 
if the p-value of the t-statistic of its regression coefficient is less than 0.0001, 
or, if 1.0 minus the p-value of the t-statistic of its coefficient is greater than 0.9999. 
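
The steps of the statistical criterion can be sketched in Python. The sketch below makes a simplifying assumption that is not part of the paper: the two-sided p-value is computed from a standard normal distribution instead of the Student's t-distribution. This approximation is only reasonable when the number of calibration points greatly exceeds the number of regression model terms; a statistics package that evaluates the t-distribution with the correct degrees of freedom would be used in practice. The function names and example values are hypothetical.

```python
import math

P_VALUE_THRESHOLD = 0.0001  # threshold of the statistical criterion


def t_statistic(coefficient, standard_error):
    """Ratio between a regression coefficient and its standard error."""
    return coefficient / standard_error


def two_sided_p_value_normal_approx(t):
    """Two-sided p-value of t, using a normal approximation (assumption).

    For a standard normal variable Z, P(|Z| > |t|) = erfc(|t| / sqrt(2)).
    """
    return math.erfc(abs(t) / math.sqrt(2.0))


def is_statistically_significant(coefficient, standard_error):
    """Apply the p-value threshold of the statistical criterion."""
    t = t_statistic(coefficient, standard_error)
    return two_sided_p_value_normal_approx(t) < P_VALUE_THRESHOLD


# Hypothetical coefficients that share the same standard error.
print(is_statistically_significant(2.0e-4, 4.0e-5))  # t = 5.0
print(is_statistically_significant(7.0e-5, 4.0e-5))  # t = 1.75
```

Under the normal approximation, the threshold of 0.0001 corresponds to a t-statistic with an absolute value of roughly 3.9, so a coefficient has to be about four times larger than its standard error to be retained.
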


The statistical criterion applies the p-value threshold of 0.0001 during the assessment of the significance 
of a term. This threshold choice is universally accepted and widely used in linear regression analysis (it 
is, for example, applied in the highly popular regression analysis software package distributed by SAS of 
Cary, North Carolina). Unfortunately, the authors were unable to find a reference for the justification of 
the percent contribution threshold of 0.05 % that the empirical criterion uses. Therefore, they decided to 
study the validity of the empirical criterion’s threshold by using results from the application of the statistical 
criterion as a point of reference. 

First, the two criteria’s regression model term assessments are compared for a case when information 
about the design of a strain-gage balance is available. Then, results from the application of the two criteria 
to a variety of balance calibration data sets are discussed. 

III. Balance Calibration Data Example 

Machine calibration data of NASA’s MK29B balance was selected for the assessment of the two criteria. 
The MK29B balance is a six-component force balance that was manufactured by the Task Corporation. 
The calibration data used for the study was obtained in Triumph Aerospace’s balance calibration machine. 
Table 1 below summarizes important characteristics of the balance and calibration data set that was used 
for the present study. 

Table 1: Balance and calibration data set characteristics of the MK29B balance. 

BALANCE NAME                           2.0 MK29B 
BALANCE TYPE                           FORCE BALANCE (TASK) 
DIAMETER                               2.0 [in] 
GAGE DISTANCE (NORMAL FORCE GAGES)     7.25 [in] 
GAGE DISTANCE (SIDE FORCE GAGES)       6.00 [in] 
CALIBRATION DATE                       SEPTEMBER 2007 
CALIBRATION METHOD                     MACHINE CALIBRATION 
TOTAL NUMBER OF CALIBRATION POINTS     1751 


The calibration data set makes a complete characterization of the physical behavior of the balance 
possible because it covers the entire use envelope that the balance may experience during a wind tunnel 
test. It was decided to process the data in its “design” format, i.e., in force balance format. In this case 
important connections between the primary gage loads and gage outputs become visible. Table 2 below lists 
load capacities of the balance in force balance format: 

Table 2: Capacities of the MK29B balance. 

            N1, lbs   N2, lbs   S1, lbs   S2, lbs   RM, in-lbs   AF, lbs 
CAPACITY    2100      2100      700       700       3800         350 


Different math term group combinations may be selected for the development of a regression model of 
the six strain-gage outputs of the balance. The math term groups themselves are defined in Ref. [1]. One of 
five group combination choices may be chosen. These five combinations are summarized in Table 3 below. 


Table 3: Math Term Group Combination Choices for Initial Data Analysis. 

GROUP NUMBER   MATH TERM GROUP COMBINATION 
(I)            F, F·G 
(II)           F, F², F·G 
(III)          F, |F|, F², F·G 
(IV)           F, |F|, F², F·|F|, F·G 
(V)            F, |F|, F², F·|F|, F·G, |F·G|, F·|G|, |F|·G 


Some of the gage outputs of the MK29B are bi-directional. Therefore, absolute value terms should be 
a part of the group combination for an initial analysis of the calibration data. It was decided to use group 
(V) for the analysis. A numerical technique called Singular Value Decomposition was applied to ensure 
that the regression model of each gage output would be free of linear dependencies between terms. Then, 
the initial regression model of each of the six gage outputs of the MK29B balance supports a total of 85 
terms. Figure 1a shows the corresponding regression model term selection for the MK29B. It is the authors’ 
experience that only about 1/4 to 1/3 of the terms are needed to correctly describe the physical behavior of 
the gage outputs of the balance. 
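
The count of 85 terms can be reproduced with a short calculation. The sketch below rests on assumptions that are not stated explicitly in the text: the intercept is counted as one term, group (V) contributes four term types per single load component and four term types per pair of load components, and no term was removed by the Singular Value Decomposition check. The function name is hypothetical.

```python
from math import comb


def term_count(n_loads, include_intercept=True):
    """Number of regression model terms of one gage output for group (V)."""
    single_load_types = 4  # term types that involve one load component
    pair_types = 4         # term types that involve a pair of load components
    count = single_load_types * n_loads + pair_types * comb(n_loads, 2)
    return count + (1 if include_intercept else 0)


# Six-component balance such as the MK29B.
print(term_count(6))
```

For six load components there are 24 single-load terms and 60 two-load terms, which together with the intercept gives 85 terms per gage output.
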

The MK29B balance is a Task balance. Therefore, only the normal and side force gage outputs of 
the balance will have significant levels of bi-directionality (see Ref. [3] for a discussion of bi-directionality 
characteristics of different balance types). Figure 1b shows the bi-directionality characteristics of the MK29B. 
It can clearly be seen that the regression models of the normal and side force gage outputs may need absolute 
value terms as the indicator variable A for bi-directionality has the appearance of an absolute value function. 

For the present investigations it was decided to focus on the regression models of the outputs of the forward 
normal force gage (R1) and the axial force gage (R6). First, the application of the empirical and statistical 
criterion to the outputs of the forward normal force gage is investigated in detail. 

Figure 2a shows a comparison of p-value and percent contribution for the regression model of the forward 
normal force gage output. The orange color highlights the values for the primary gage load (N1). We see 
that the p-value is less than the threshold of 0.0001 and the percent contribution is 100.00 %. Therefore, 
both criteria say that the term is significant. The green color highlights all terms that are significant based 
on an application of the statistical criterion. A total of 26 of the 85 terms are significant according to the 
statistical criterion. The empirical criterion is also fulfilled for all 26 terms as the absolute value of the 
percent contribution of each one of the 26 terms exceeds the threshold of 0.05 %. 

It is interesting to study the connection between a set of cross-product terms that are a part of the 
regression model of the forward normal force gage output. Therefore, the significance of the following four 
cross-product terms is evaluated: N1·N2, N1·|N2|, |N1|·N2, and |N1·N2|. The yellow color in Fig. 2b 
highlights values for both the p-value and percent contribution of all terms that are connected with the four 
cross-product terms (the numerical values reported in Fig. 2a and 2b are identical). For clarity, these values 
may be listed in table format. First, Table 4a compares the significance of the four linear terms that are 
indirectly related to the four cross-product terms. 

Table 4a: Significance of selected linear terms of regression model of R1. 

MATH TERM   p-value (p)   1 - p      percent contribution 
N1          < 0.0001      > 0.9999   > 0.05 % (100.00 %) 
N2          < 0.0001      > 0.9999   > 0.05 % (3.10 %) 
|N1|        < 0.0001      > 0.9999   > 0.05 % (0.98 %) 
|N2|        < 0.0001      > 0.9999   > 0.05 % (0.64 %) 


It can be seen that both the statistical criterion (p-value < 0.0001) and the empirical criterion (percent 
contribution > 0.05 %) are fulfilled for all four terms. They are significant as far as the regression model 
of the forward normal force gage output is concerned. It remains to inspect the percent contribution and 
p-value of the cross-product terms themselves. These values are listed in Table 4b. 

Table 4b: Significance of selected cross-product terms from regression model of R1. 

MATH TERM             p-value (p)   1 - p      percent contribution 
N1·N2                 < 0.0001      > 0.9999   > 0.05 % (0.23 %) 
|N1·N2| = |N1|·|N2|   < 0.0001      > 0.9999   > 0.05 % (0.24 %) 
N1·|N2|               < 0.0001      > 0.9999   > 0.05 % (0.37 %) 
|N1|·N2               < 0.0001      > 0.9999   > 0.05 % (0.26 %) 


Again, as was the case with the terms listed in Table 4a, it can be seen that both the statistical 
criterion (p-value < 0.0001) and the empirical criterion (percent contribution > 0.05 %) are fulfilled for all 
four terms. All four cross-product terms are significant as far as the regression model of the forward normal 
force gage output is concerned. The reason for the importance of the four cross-product terms can be better 
understood if we look at their origin. They are the cross-product terms that are obtained after taking the 
bi-directionality of the gage outputs of the normal force gages into account. Thus, we get: 


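$$
\left[\,\alpha\,N1 + \beta\,|N1|\,\right] \cdot \left[\,\varepsilon\,N2 + \lambda\,|N2|\,\right]
= (\alpha\varepsilon)\,N1\,N2 + (\alpha\lambda)\,N1\,|N2|
+ (\beta\varepsilon)\,|N1|\,N2 + (\beta\lambda)\,|N1|\,|N2| \tag{1}
$$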


The output of the forward normal force gage is simply sensitive to the loads that are applied over both 
the forward and aft normal force gage. Therefore, all four cross-product terms are needed in the regression 
model of the forward normal force output. 

Situations exist, however, when not all related cross-product terms are needed in the regression model 
of a gage output. The regression model of the axial gage output may be used to illustrate this statement. 
Figure 3a shows a comparison of p-value and percent contribution for the regression model of the axial gage 
output. The orange color highlights the values for the primary gage load (AF). We see that the p-value is 
less than the threshold of 0.0001 and the percent contribution is 100 %. Therefore, both criteria say that the 
term is significant. The green color highlights all terms that are significant based on an application of the 
statistical criterion. A total of 30 of the 85 terms are considered significant based on the statistical criterion. 
The empirical criterion is also fulfilled for all 30 terms as the absolute value of the percent contribution of 
each one of the 30 terms exceeds the threshold of 0.05 %. 

Now, the significance of the following four cross-product terms is evaluated: N2·AF, N2·|AF|, 
|N2|·AF, and |N2|·|AF|. The yellow and blue colors in Fig. 3b highlight values for both the p-value 
and percent contribution of terms that make up the four cross-product terms (the numerical values reported 
in Fig. 3a and 3b are identical). For clarity, these values may be listed in table format. First, Table 5a 
compares the significance of the four linear terms that are indirectly related to the four cross-product terms. 


Table 5a: Significance of selected linear terms from regression model of R6. 

MATH TERM   p-value (p)   1 - p      percent contribution 
N2          < 0.0001      > 0.9999   > 0.05 % (5.13 %) 
AF          < 0.0001      > 0.9999   > 0.05 % (100.00 %) 
|N2|        < 0.0001      > 0.9999   > 0.05 % (0.77 %) 
|AF|        0.0741        0.9259     > 0.05 % (0.08 %) 


The empirical criterion (percent contribution > 0.05 %) is fulfilled for all four terms that are listed 
above. The result for the statistical criterion, i.e., p-value < 0.0001, is different. Only three of the four 
terms appear to be significant. The statistical criterion identified the term |AF| as an insignificant term in 
the regression model of the axial gage output. It remains to inspect the percent contribution and p-value 
of the cross-product terms themselves before a final evaluation of the discrepancy between the two criteria 
can be made. These values are listed in Table 5b. 

Table 5b: Significance of selected cross-product terms from regression model of R6. 

MATH TERM             p-value (p)   1 - p      percent contribution 
N2·AF                 < 0.0001      > 0.9999   > 0.05 % (0.30 %) 
|N2·AF| = |N2|·|AF|   0.0074        0.9926     > 0.05 % (0.12 %) 
N2·|AF|               0.0841        0.9159     > 0.05 % (0.06 %) 
|N2|·AF               0.0010        0.9990     > 0.05 % (0.13 %) 


Again, the empirical criterion (percent contribution > 0.05 %) says that all four cross-product terms 
are significant. The statistical criterion, on the other hand, is only fulfilled for a single term (N2·AF). It 
considers the remaining three cross-product terms (N2·|AF|, |N2|·AF, |N2|·|AF|) to be insignificant. 
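
The discrepancy between the two criteria can be reproduced by applying both thresholds to the values of Tables 5a and 5b. The sketch below is an illustration only: the helper names are hypothetical, and `None` marks a p-value that the tables report only as "< 0.0001".

```python
PC_THRESHOLD = 0.05   # percent, empirical criterion
P_THRESHOLD = 0.0001  # statistical criterion

# (term, p-value, percent contribution [%]) from Tables 5a and 5b.
# None marks a p-value that is only reported as "< 0.0001".
terms = [
    ("N2", None, 5.13),
    ("AF", None, 100.00),
    ("|N2|", None, 0.77),
    ("|AF|", 0.0741, 0.08),
    ("N2*AF", None, 0.30),
    ("|N2*AF|", 0.0074, 0.12),
    ("N2*|AF|", 0.0841, 0.06),
    ("|N2|*AF", 0.0010, 0.13),
]


def empirical_ok(pc):
    """Empirical criterion: |percent contribution| > 0.05 %."""
    return abs(pc) > PC_THRESHOLD


def statistical_ok(p):
    """Statistical criterion: p-value < 0.0001."""
    return p is None or p < P_THRESHOLD


# Terms for which the two criteria disagree.
disagreements = [
    name for name, p, pc in terms if empirical_ok(pc) != statistical_ok(p)
]
print(disagreements)
```

All eight terms pass the empirical criterion, so every disagreement is a term that the empirical criterion retains and the statistical criterion rejects.
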

The reason for the discrepancy between the results for the empirical and statistical criterion has to be 
investigated in more detail. The discrepancy can be better understood if we look at the origin of the four 
cross-product terms. We get: 


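$$
\left[\,\mu\,N2 + \nu\,|N2|\,\right] \cdot \left[\,\xi\,AF + \rho\,|AF|\,\right]
= \underbrace{(\mu\xi)\,N2\,AF}_{p<0.0001} + \underbrace{(\mu\rho)\,N2\,|AF|}_{p=0.0841}
+ \underbrace{(\nu\xi)\,|N2|\,AF}_{p=0.0010} + \underbrace{(\nu\rho)\,|N2|\,|AF|}_{p=0.0074} \tag{2}
$$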


From Fig. 1b it is known that the axial gage output of a Task balance like the MK29B is not bi-directional. 
Therefore, there is no objective justification for using an absolute value term of the axial force in the regression 
model. This conclusion is also supported by the corresponding p-value that is listed in Table 5a: 

$$
p\text{-value of } |AF| = 0.0741 \gg 0.0001 \quad \Longrightarrow \quad \rho\, |AF| \approx 0
\tag{3}
$$

Now, after using the result reported in Eq. (3) in Eq. (2), we get the following result for the cross-product 
terms: 

$$
\left[\, \mu\, N2 + \nu\, |N2| \,\right] \cdot \left[\, \xi\, AF + 0 \,\right]
\approx \underbrace{(\mu\xi)\, N2\, AF}_{p < 0.0001}
+ \underbrace{(\nu\xi)\, |N2|\, AF}_{p = 0.0010}
\tag{4}
$$

A further investigation of the p-value and the percent contribution of the cross-product term |N2| AF 
was inconclusive. Its p-value is 0.0010 and its percent contribution is 0.13 %. Therefore, the term is a 
borderline case between being significant and being insignificant as the p-value and the percent contribution 
are relatively close to their respective thresholds. 

The analysis of the cross-product terms of N2 and AF of the regression model of the axial gage output 
also illustrated that the four theoretically possible cross-product terms N2 AF, N2 |AF|, |N2| AF, and 
|N2| |AF| do not have to be present simultaneously in the regression model of the gage outputs. Some of 
the four terms may simply be omitted if either the statistical or the empirical criterion indicates that they are 
insignificant. The omission of the two terms N2 |AF| and |N2| |AF| in Eq. (4) is also supported by a known 
design characteristic of the axial flexure element of the MK29B Task balance. The axial gage output of the 
MK29B is not bi-directional, i.e., the term |AF| in its regression model is insignificant, because the axial 
flexure element is joined to the metric and non-metric part of the balance by using tight press pins. 

In the next section it is explained how a connection between the threshold of the statistical criterion 
and the threshold of the empirical criterion can be established. This connection is first illustrated by using 
the MK29B data set. Then, results of a survey of 12 different balance calibration data sets are discussed. 
The data sets were processed to evaluate the magnitude of the empirical criterion’s threshold. 

IV. Survey of Balance Calibration Data 

The authors performed a survey of different balance calibration data sets to investigate whether or not 
the empirical criterion’s threshold of 0.05 % is applicable to a wide variety of balance calibration data sets. 
Therefore, it was decided to define a connection between the threshold of the statistical criterion and the 
threshold of the empirical criterion so that the statistical criterion may be used to evaluate the empirical 
criterion. 

Different approaches may be used to establish a connection between the statistical and the empirical 
criterion. One approach, for example, counts the number of regression model terms that simultaneously 
satisfy the statistical and the empirical criterion for a given percent contribution threshold choice. This 
process is repeated over the entire range of expected percent contribution thresholds (from ≈ 0.001 % to 
100.0 %). Then, the resulting term count is plotted versus the corresponding percent contribution threshold 
choice. The “ideal” percent contribution threshold is the largest threshold that, if chosen, still maximizes 
the term count. 

Results for the regression model of the axial gage output of the MK29B balance may be used to demonstrate 
the approach that is described in the previous paragraph. The p-values and percent contributions 
listed in Fig. 3a are used for this study. First, however, in order to highlight important characteristics of 
the empirical criterion itself, the number of significant terms of the 85 term model of the axial gage output 
is counted by only applying the empirical criterion over the expected percent contribution threshold range. 
The intercept term is intentionally not counted during this investigation because its p-value is not computed. 
Therefore, the maximum of the term count becomes 84. Figure 4a shows the result of these calculations. It 
can be seen that the term count remains constant as long as the percent contribution threshold is below 
the minimum that is obtained for the regression model of the axial gage output (the minimum is listed as 
0.00406 % in Fig. 3a). Then, the term count monotonically decreases as the threshold increases. It reaches 
the theoretical minimum of zero when the percent contribution threshold is set to 100 %. 

Now, using the approach proposed in the second paragraph of this section, the terms are counted that 
simultaneously satisfy the statistical and the empirical criterion for a given percent contribution threshold 
choice. Figure 4b shows corresponding results. It can be seen that the term count first remains constant as 
the percent contribution threshold increases. At a certain value, i.e., at ≈ 0.13 %, the term count suddenly 
starts to decrease. This threshold is the “ideal” percent contribution threshold that should be used if the 
empirical criterion is applied to the regression model of the axial gage output. It defines the boundary 
between “over-fitting” and “under-fitting” of the axial gage output. A decrease of the percent contribution 
threshold, for example, would result in the inclusion of insignificant terms in the regression model. These 
additional terms could cause “over-fitting” of the axial gage output. An increase of the percent contribution 
threshold, on the other hand, would result in the omission of significant terms in the regression model. These 
missing terms could cause “under-fitting” of the axial gage output. 
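
The threshold sweep described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the procedure used to generate Figs. 4a and 4b. It uses only the eight terms of Tables 5a and 5b instead of the full 84 term model, so the threshold it returns differs from the value read from Fig. 4b. The logarithmic grid, the stand-in value P_BELOW for p-values reported as "< 0.0001", and all function names are hypothetical choices.

```python
import math

P_THRESHOLD = 0.0001   # statistical criterion: p-value < 0.0001
P_BELOW = 0.00005      # assumed stand-in for entries printed as "< 0.0001"

# (p-value, percent contribution in %) from Tables 5a and 5b (subset of the model)
TERMS = [
    (P_BELOW, 5.13), (P_BELOW, 100.00), (P_BELOW, 0.77), (0.0741, 0.08),
    (P_BELOW, 0.30), (0.0074, 0.12), (0.0841, 0.06), (0.0010, 0.13),
]


def log_grid(lo=0.001, hi=100.0, points_per_decade=100):
    """Logarithmically spaced thresholds between lo and hi (in percent)."""
    steps = round(math.log10(hi / lo) * points_per_decade)
    return [lo * 10.0 ** (i / points_per_decade) for i in range(steps + 1)]


def count_empirical(terms, pc_threshold):
    """Term count if only the empirical criterion is applied (cf. Fig. 4a)."""
    return sum(1 for _, pc in terms if abs(pc) > pc_threshold)


def count_both(terms, pc_threshold, p_threshold=P_THRESHOLD):
    """Term count if both criteria must be satisfied (cf. Fig. 4b)."""
    return sum(
        1 for p, pc in terms if p < p_threshold and abs(pc) > pc_threshold
    )


def ideal_threshold(terms, grid):
    """Largest grid threshold that still maximizes the combined term count."""
    counts = [count_both(terms, t) for t in grid]
    best = max(counts)
    return max(t for t, c in zip(grid, counts) if c == best)


grid = log_grid()
ideal = ideal_threshold(TERMS, grid)
print(f"ideal threshold for this subset: {ideal:.4f} %")
```

In this sketch the combined term count stays at its maximum until the threshold reaches the smallest percent contribution among the statistically significant terms, which is the behavior described for Fig. 4b.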

The approach discussed in the previous three paragraphs is complex. A simpler connection between the 
statistical and the empirical criterion can be defined by finding, for a given regression model of the 
strain-gage outputs, the smallest percent contribution that still satisfies the statistical 
criterion. Consequently, terms below this percent contribution will no longer satisfy the statistical criterion, 
i.e., p-value < 0.0001, and are considered statistically insignificant. 

The connection between the two thresholds can be illustrated by using the calibration data set of the 
MK29B that is discussed in the previous section. Figure 2a shows the percent contributions of the regression 
model of the forward normal force gage output of the MK29B balance. The smallest absolute value of the 
percent contribution that satisfies the statistical criterion is the percent contribution of term 29, i.e., 0.09 %. 
The percent contribution of the intercept term must be ignored during the search 
for the percent contribution minimum because its p-value is not defined. Similarly, the minima of the 
percent contributions of the regression models of the remaining five primary gage outputs can be determined. 
All six minima of the 85 term math model, i.e., of math term group combination (V) in Table 3, are listed 
in Table 6 below. 

Table 6: Percent contribution minima of regression models of MK29B calibration data.

                GROUP (TABLE 3)    R1        R2        R3        R4        R5        R6
ABS(MINIMUM)    (V)                0.09 %    0.11 %    0.26 %    0.19 %    0.20 %    0.15 %


The minimum of the six minima is 0.09 %. If this value is chosen as the threshold for the empirical 
criterion, then all terms of the regression models of the six gage outputs that the statistical criterion 
identifies as significant for the same data set and math term group combination are retained. 
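
The simpler connection between the two criteria amounts to two minimum searches, which may be sketched in Python as shown below. The function name smallest_significant_pc and the dictionary TABLE_6_MINIMA are hypothetical. The gage output labels R3 to R5 are assumed from the numbering of the six primary gage outputs, and the sample of terms is taken from Tables 5a and 5b, with "< 0.0001" represented by an arbitrary stand-in value.

```python
P_THRESHOLD = 0.0001   # statistical criterion: p-value < 0.0001


def smallest_significant_pc(terms, p_threshold=P_THRESHOLD):
    """Smallest |percent contribution| among terms that satisfy the
    statistical criterion. `terms` holds (p-value, percent contribution)
    pairs; the intercept must be left out because it has no p-value."""
    significant = [abs(pc) for p, pc in terms if p < p_threshold]
    return min(significant) if significant else None


# Minima per gage output for group combination (V), values from Table 6.
# The labels R3, R4, and R5 are assumed.
TABLE_6_MINIMA = {
    "R1": 0.09, "R2": 0.11, "R3": 0.26, "R4": 0.19, "R5": 0.20, "R6": 0.15,
}

# Threshold that retains every statistically significant term of all six models
data_set_threshold = min(TABLE_6_MINIMA.values())
print(f"minimum of the six minima: {data_set_threshold:.2f} %")

# Small sample of terms from Tables 5a and 5b (not the complete model of R6);
# 0.00005 is an assumed stand-in for p-values printed as "< 0.0001".
sample = [(0.00005, 5.13), (0.00005, 0.77), (0.0741, 0.08), (0.00005, 0.30)]
sample_minimum = smallest_significant_pc(sample)
```

Because the sample contains only a few of the terms of the model of R6, its minimum is not expected to match the value listed for R6 in Table 6.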

The minimum of the six minima may change if the math term group combination is changed from, say, 
combination (V) to combination (IV). The MK29B balance is a multi-piece balance that typically needs 
absolute value terms. Therefore, the minima for group combinations (III), (IV), and (V) were investigated 
in order to get an idea of the dependency of the percent contribution threshold on the math term group 
combination. The results of these investigations are shown in the table in Fig. 5a. 

The percent contribution threshold of the empirical criterion does not just depend on the math term 
group combination that is chosen for the regression analysis of the gage outputs. It may also depend on 
the balance type, whether or not a tare load iteration was performed, and on the calibration process that a 
balance calibration laboratory uses. Therefore, calibration data sets of a total of 12 different balances were 
investigated in order to get a more general idea of the percent contribution threshold variation. 

Figure 5a shows the percent contribution thresholds that were obtained for multi-piece balance data 
sets. Figure 5b shows the percent contribution thresholds that were obtained for single-piece and hybrid 
balance data sets. Figure 5c shows the percent contribution thresholds that were obtained for semispan 
balance data sets. The results for the recommended math term group combinations of each balance type 
are highlighted in boldface. The arithmetic mean of all thresholds highlighted in boldface is 0.04 %. This 
value is very close to the original choice of 0.05 % that was listed in the original definition of the empirical 
criterion. However, it can also be concluded from the results of the different balance calibration data sets 
that the percent contribution threshold of 0.05 % is not universally applicable. Higher or lower values of the 
threshold appear to be appropriate for some of the data sets that were investigated. 

V. Summary and Conclusions 

An empirical criterion was evaluated that may be used to assess the significance of individual terms of 
regression models of wind tunnel strain-gage balance outputs. The criterion compares the percent contribution 
of a term with the threshold of 0.05 % in order to decide whether or not the term is significant. 

Calibration data sets from a variety of strain-gage balances were processed to test the criterion. During 
these tests the term assessment of the empirical criterion was compared with the term assessment of an 
alternate statistical criterion. For this purpose, a connection between the empirical and statistical criterion was 
first defined by identifying the smallest percent contribution for the regression models of a given balance 
calibration data set that still satisfied the statistical criterion. Then, the mean value of the percent 
contribution minima of all tested balance data sets was computed. It was found that this mean value, i.e., 0.04 %, 
is close to the proposed percent contribution threshold of 0.05 %. 

The study of the different balance calibration data sets illustrated that the boundary between a 
significant and an insignificant term cannot be defined very well for the empirical criterion. Therefore, the 
empirical criterion seems to be suitable only for a crude assessment of the significance of a regression model term. 
Consequently, regression model term reduction for the prevention of over-fitting of balance calibration data 
should only be performed by using the more reliable and universally applicable statistical criterion. 

Appendix: Definition of the Percent Contribution 


The percent contribution describes the contribution of each term of the regression model of the gage 
outputs to the total fitted value, expressed as a percentage of the contribution of the principal diagonal term. 
It is used to assess the degree of linearity of the regression model of the gage outputs. 

The percent contribution is defined as the ratio of two numerical values. This ratio is expressed as a 
percentage. The percent contribution only depends on (i) the known capacities of the load components of 
the balance and (ii) the regression coefficients that are the result of the regression analysis of the balance 
calibration data. For convenience, the regression coefficient nomenclature introduced in Ref. [1], Eq. (3.1.3), 
may be used to illustrate the calculation of the percent contribution. Then, the first numerical value can be 
expressed as the product of the regression coefficient “b1(i,i)” of the primary linear term of the regression 
model of the gage output with the capacity “C(i)” of the related balance load component. We get: 

$$
Q(i) = b1(i,i) \cdot C(i) \tag{5}
$$

Equation (5) defines the reference value that is used to investigate the linearity of the regression model 
of the gage output. It is the denominator of the ratio that defines the percent contribution. The numerator 
of the ratio, on the other hand, is the product of the investigated regression coefficient with the related 
regressor variable value that is obtained by using the load capacities as applied loads. Now, the percent 
contribution of the ten math term type groups defined in Ref. [1], Eq. (3.1.3), can be summarized as follows: 


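$$
\begin{aligned}
PC(i,b1) &= 100\,\% \cdot \left[\, b1(i,j) \cdot C(j) \,\right] / Q(i) && \text{(6a)} \\
PC(i,b2) &= 100\,\% \cdot \left[\, b2(i,j) \cdot |C(j)| \,\right] / Q(i) && \text{(6b)} \\
PC(i,c1) &= 100\,\% \cdot \left[\, c1(i,j) \cdot C(j)^2 \,\right] / Q(i) && \text{(6c)} \\
PC(i,c2) &= 100\,\% \cdot \left[\, c2(i,j) \cdot C(j) \cdot |C(j)| \,\right] / Q(i) && \text{(6d)} \\
PC(i,c3) &= 100\,\% \cdot \left[\, c3(i,j,k) \cdot C(j) \cdot C(k) \,\right] / Q(i) && \text{(6e)} \\
PC(i,c4) &= 100\,\% \cdot \left[\, c4(i,j,k) \cdot |C(j) \cdot C(k)| \,\right] / Q(i) && \text{(6f)} \\
PC(i,c5) &= 100\,\% \cdot \left[\, c5(i,j,k) \cdot C(j) \cdot |C(k)| \,\right] / Q(i) && \text{(6g)} \\
PC(i,c6) &= 100\,\% \cdot \left[\, c6(i,j,k) \cdot |C(j)| \cdot C(k) \,\right] / Q(i) && \text{(6h)} \\
PC(i,d1) &= 100\,\% \cdot \left[\, d1(i,j) \cdot C(j)^3 \,\right] / Q(i) && \text{(6i)} \\
PC(i,d2) &= 100\,\% \cdot \left[\, d2(i,j) \cdot |C(j)^3| \,\right] / Q(i) && \text{(6j)}
\end{aligned}
$$
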

where j = 1, ..., n and k = j + 1, ..., n. The denominator of the percent contribution, i.e., Q(i), depends on 
the gage output index i. It changes whenever the index of the gage output changes during the calculation 
of the percent contribution. 
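
The definitions in Eqs. (5) and (6a) to (6j) translate directly into a short Python function. The sketch below is illustrative only. The function names are hypothetical, and the coefficients and capacities in the example are invented numbers that do not belong to any balance discussed in this paper.

```python
CROSS_GROUPS = {"c3", "c4", "c5", "c6"}   # groups that need a second capacity


def reference_value(b1_ii, c_i):
    """Eq. (5): Q(i) = b1(i,i) * C(i)."""
    return b1_ii * c_i


def regressor_at_capacity(group, c_j, c_k=None):
    """Regressor variable value of a math term group, evaluated by using
    the load capacities C(j) and C(k) as applied loads."""
    if group in CROSS_GROUPS and c_k is None:
        raise ValueError(f"group {group!r} needs a second capacity c_k")
    if group == "b1":
        return c_j
    if group == "b2":
        return abs(c_j)
    if group == "c1":
        return c_j ** 2
    if group == "c2":
        return c_j * abs(c_j)
    if group == "c3":
        return c_j * c_k
    if group == "c4":
        return abs(c_j * c_k)
    if group == "c5":
        return c_j * abs(c_k)
    if group == "c6":
        return abs(c_j) * c_k
    if group == "d1":
        return c_j ** 3
    if group == "d2":
        return abs(c_j ** 3)
    raise ValueError(f"unknown math term group {group!r}")


def percent_contribution(group, coeff, q_i, c_j, c_k=None):
    """Eqs. (6a) to (6j): 100 % * [coefficient * regressor] / Q(i)."""
    return 100.0 * coeff * regressor_at_capacity(group, c_j, c_k) / q_i


# Invented example values (not taken from any calibration data set)
q = reference_value(b1_ii=0.5, c_i=2000.0)
pc_primary = percent_contribution("b1", 0.5, q, c_j=2000.0)
pc_cross = percent_contribution("c3", 1.0e-6, q, c_j=1000.0, c_k=500.0)
print(f"primary term: {pc_primary:.2f} %, cross-product term: {pc_cross:.4f} %")
```

By construction, the percent contribution of the primary linear term of a gage output is 100 %, which agrees with the value listed for AF in Table 5a.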

Fig. 1a Initial regression model term choice for the analysis of the MK29B balance calibration data set. 
Fig. 1b Bi-directionality characteristics of NASA’s MK29B Task balance. 
Fig. 2a Comparison of p-value and percent contribution for regression model of forward normal gage output R1. 
Fig. 2b Comparison of significance of cross-product terms N1 N2, |N1 N2|, N1 |N2|, and |N1| N2. 
Fig. 3a Comparison of p-value and percent contribution for regression model of axial gage output R6. 
Fig. 3b Comparison of significance of cross-product terms N2 AF, |N2 AF|, N2 |AF|, and |N2| AF. 
Fig. 4a Number of terms of the math model of the axial gage outputs that satisfy only the empirical criterion. 
Fig. 4b Number of terms of the math model of the axial gage outputs that satisfy both criteria. 
Fig. 5a Observed percent contribution minima for multi-piece balances. 
Fig. 5b Observed percent contribution minima for single-piece & hybrid balances. 
Fig. 5c Observed percent contribution minima for semispan balances. 

*Recommended math term group combination, i.e., the smallest group combination that 
met accuracy expectations (a definition of all group combinations can be found in Table 3). 